
    The Complexity of the Simplex Method

    The simplex method is a well-studied and widely used pivoting method for solving linear programs. When Dantzig originally formulated the simplex method, he gave a natural pivot rule that pivots into the basis a variable with the most violated reduced cost. In their seminal work, Klee and Minty showed that this pivot rule takes exponential time in the worst case. We prove two main results on the simplex method. Firstly, we show that it is PSPACE-complete to find the solution that is computed by the simplex method using Dantzig's pivot rule. Secondly, we prove that deciding whether Dantzig's rule ever chooses a specific variable to enter the basis is PSPACE-complete. We use the known connection between Markov decision processes (MDPs) and linear programming, and an equivalence between Dantzig's pivot rule and a natural variant of policy iteration for average-reward MDPs. We construct MDPs and show PSPACE-completeness results for single-switch policy iteration, which in turn imply our main results for the simplex method.
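
As a rough illustration of what "most violated reduced cost" means, the entering-variable choice can be sketched as follows. The sketch assumes a minimization problem in which a negative reduced cost counts as a violation. The function name `dantzig_entering_variable`, the `tol` parameter, and the lowest-index tie-break are illustrative assumptions and are not taken from the paper.

```python
def dantzig_entering_variable(reduced_costs, tol=1e-9):
    """Pick the entering variable under Dantzig's rule (minimization convention).

    Assumption: a reduced cost below -tol is a violation, and the most
    negative one is the "most violated". Ties go to the lowest index.
    Returns None when no reduced cost is violated (current basis is optimal).
    """
    best_index = None
    best_value = -tol
    for j, cost in enumerate(reduced_costs):
        if cost < best_value:  # strictly more violated than anything seen so far
            best_value = cost
            best_index = j
    return best_index


if __name__ == "__main__":
    # Hypothetical reduced costs for three nonbasic variables.
    print(dantzig_entering_variable([1.0, -3.0, -0.5]))
```

This covers only the choice of entering variable; the ratio test for the leaving variable and the basis update are omitted.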

    A flowering-time gene network model for association analysis in Arabidopsis thaliana

    In our project we want to determine a set of single nucleotide polymorphisms (SNPs) that have a major effect on the flowering time of Arabidopsis thaliana. Instead of performing a genome-wide association study on all SNPs in the genome of Arabidopsis thaliana, we examine the subset of SNPs from the flowering-time gene network model. We are interested in how the results of the association study vary when using only the ascertained subset of SNPs from the flowering network model, and when additionally using the information encoded by the structure of the network model. The network model is compiled from the literature by manual analysis and contains genes which have been found to affect the flowering time of Arabidopsis thaliana [Far+08; KW07]. The genes in this model are annotated with the SNPs that are located in these genes, or in close proximity to them. In a baseline comparison between the subset of SNPs from the graph and the set of all SNPs, we omit the structural information and calculate the correlation between the individual SNPs and the flowering time phenotype using statistical methods. Through this we can determine the subset of SNPs with the highest correlation to the flowering time. In order to further refine this subset, we include the additional information provided by the network structure by conducting a graph-based feature pre-selection. In the further course of this project we want to validate and examine the resulting set of SNPs and their corresponding genes with experimental methods.

    A Kernel Method for the Two-sample Problem

    We propose a framework for analyzing and comparing distributions, allowing us to design statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS). We present two tests based on large deviation bounds for the test statistic, while a third is based on the asymptotic distribution of this statistic. The test statistic can be computed in quadratic time, although efficient linear time approximations are available. Several classical metrics on distributions are recovered when the function space used to compute the difference in expectations is allowed to be more general (e.g., a Banach space). We apply our two-sample tests to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where they perform strongly. Excellent performance is also obtained when comparing distributions over graphs, for which these are the first such tests.
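
The quadratic-time statistic described above can be sketched in a few lines. This is a minimal illustration, assuming a Gaussian kernel with bandwidth `sigma` and the biased empirical estimate of the squared statistic (mean within-sample kernel values minus twice the mean cross-sample value). The function names and the sample data are hypothetical, and the thresholds that turn the statistic into a test are not shown.

```python
import math


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd2_biased(xs, ys, sigma=1.0):
    """Biased estimate of the squared RKHS mean discrepancy between two samples.

    Cost is quadratic in the sample sizes because every pair of points
    is passed through the kernel.
    """
    m, n = len(xs), len(ys)
    k_xx = sum(gaussian_kernel(a, b, sigma) for a in xs for b in xs) / (m * m)
    k_yy = sum(gaussian_kernel(a, b, sigma) for a in ys for b in ys) / (n * n)
    k_xy = sum(gaussian_kernel(a, b, sigma) for a in xs for b in ys) / (m * n)
    return k_xx + k_yy - 2.0 * k_xy


if __name__ == "__main__":
    # Hypothetical one-dimensional samples, written as 1-tuples.
    near = [(0.0,), (1.0,)]
    far = [(5.0,), (6.0,)]
    print(mmd2_biased(near, near), mmd2_biased(near, far))
```

The estimate is zero when both samples are identical and grows as the samples separate; choosing the kernel and bandwidth is left open here.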

    Hilbert Space Representations of Probability Distributions

    Many problems in unsupervised learning require the analysis of features of probability distributions. At the most fundamental level, we might wish to determine whether two distributions are the same, based on samples from each; this is known as the two-sample or homogeneity problem. We use kernel methods to address this problem, by mapping probability distributions to elements in a reproducing kernel Hilbert space (RKHS). Given a sufficiently rich RKHS, these representations are unique: thus comparing feature space representations allows us to compare distributions without ambiguity. Applications include testing whether cancer subtypes are distinguishable on the basis of DNA microarray data, and whether low frequency oscillations measured at an electrode in the cortex have a different distribution during a neural spike. A more difficult problem is to discover whether two random variables drawn from a joint distribution are independent. It turns out that any dependence between pairs of random variables can be encoded in a cross-covariance operator between appropriate RKHS representations of the variables, and we may test independence by looking at a norm of the operator. We demonstrate this independence test by establishing dependence between an English text and its French translation, as opposed to French text on the same topic but otherwise unrelated. Finally, we show that this operator norm is itself a difference in feature means.
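
One way to make the operator-norm idea concrete is the empirical Hilbert-Schmidt norm of the cross-covariance operator, computed from centered Gram matrices. The sketch below assumes that particular norm, Gaussian kernels on scalar data, and the biased estimate trace(KHLH)/n²; the abstract does not specify these choices, and all function names are hypothetical.

```python
import math


def gaussian_gram(points, sigma=1.0):
    """Gram matrix of a Gaussian kernel over a list of scalars."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in points]
            for a in points]


def center(gram):
    """Double-center a symmetric Gram matrix, i.e. compute H K H."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - row_means[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic_biased(xs, ys, sigma=1.0):
    """Biased empirical Hilbert-Schmidt dependence measure for paired samples."""
    n = len(xs)
    k_centered = center(gaussian_gram(xs, sigma))
    l_gram = gaussian_gram(ys, sigma)
    # trace(K H L H) equals the elementwise sum of (H K H) * L for symmetric L.
    return sum(k_centered[i][j] * l_gram[i][j]
               for i in range(n) for j in range(n)) / (n * n)


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0]          # hypothetical paired observations
    print(hsic_biased(xs, xs))          # fully dependent pair
    print(hsic_biased([1.0] * 4, xs))   # constant variable carries no dependence
```

A value near zero is consistent with independence; turning the statistic into a test additionally requires a null distribution or threshold, which is not shown.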

    RNA atlas of human bacterial pathogens uncovers stress dynamics linked to infection

    Bacterial processes necessary for adaptation to stressful host environments are potential targets for new antimicrobials. Here, we report large-scale transcriptomic analyses of 32 human bacterial pathogens grown under 11 stress conditions mimicking human host environments. The potential relevance of the in vitro stress conditions and responses is supported by comparisons with available in vivo transcriptomes of clinically important pathogens. Calculation of a probability score enables comparative cross-microbial analyses of the stress responses, revealing common and unique regulatory responses to different stresses, as well as overlapping processes participating in different stress responses. We identify conserved and species-specific 'universal stress responders', that is, genes showing altered expression in multiple stress conditions. Non-coding RNAs are involved in a substantial proportion of the responses. The data are collected in a freely available, interactive online resource (PATHOgenex). Bacterial stress responses are potential targets for new antimicrobials. Here, Avican et al. present global transcriptomes for 32 bacterial pathogens grown under 11 stress conditions, and identify common and unique regulatory responses, as well as processes participating in different stress responses. Peer reviewed.

    A Principled Approach to Analyze Expressiveness and Accuracy of Graph Neural Networks

    Graph neural networks (GNNs) have seen increasing success recently, with many GNN variants achieving state-of-the-art results on node and graph classification tasks. The proposed GNNs, however, often implement complex node and graph embedding schemes, which makes it challenging to explain their performance. In this paper, we investigate the link between a GNN's expressiveness, that is, its ability to map different graphs to different representations, and its generalization performance in a graph classification setting. In particular, we propose a principled experimental procedure where we (i) define a practical measure for expressiveness, (ii) introduce an expressiveness-based loss function that we use to train a simple yet practical GNN that is permutation-invariant, and (iii) illustrate our procedure on benchmark graph classification problems and on an original real-world application. Our results reveal that expressiveness alone does not guarantee better performance, and that a powerful GNN should be able to produce graph representations that are well separated with respect to the class of the corresponding graphs.

    Insular volume abnormalities associated with different transition probabilities to psychosis

    Background: Although individuals vulnerable to psychosis show brain volumetric abnormalities, structural alterations underlying different probabilities for later transition are unknown. The present study addresses this issue by means of voxel-based morphometry (VBM). Method: We investigated grey matter volume (GMV) abnormalities by comparing four neuroleptic-free groups: individuals with first episode of psychosis (FEP) and with at-risk mental state (ARMS), with either long-term (ARMS-LT) or short-term ARMS (ARMS-ST), compared to the healthy control (HC) group. Using three-dimensional (3D) magnetic resonance imaging (MRI), we examined 16 FEP, 31 ARMS, clinically followed up for on average 3 months (ARMS-ST, n=18) and 4.5 years (ARMS-LT, n=13), and 19 HC. Results: The ARMS-ST group showed less GMV in the right and left insula compared to the ARMS-LT (Cohen's d 1.67) and FEP groups (Cohen's d 1.81) respectively. These GMV differences were correlated positively with global functioning in the whole ARMS group. Insular alterations were associated with negative symptomatology in the whole ARMS group, and also with hallucinations in the ARMS-ST and ARMS-LT subgroups. We found a significant effect of previous antipsychotic medication use on GMV abnormalities in the FEP group. Conclusions: GMV abnormalities in subjects at high clinical risk for psychosis are associated with negative and positive psychotic symptoms, and global functioning. Alterations in the right insula are associated with a higher risk for transition to psychosis, and thus may be related to different transition probabilities.
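
For readers unfamiliar with the effect size reported above, Cohen's d for two independent groups is commonly computed as the difference in group means divided by the pooled standard deviation. The sketch below assumes that pooled-variance definition; the abstract does not state which variant the study used, and the function name and sample values are purely illustrative, not study data.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance, n - 1 denominator
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b)
                          / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


if __name__ == "__main__":
    # Made-up numbers chosen only to keep the arithmetic easy to follow.
    print(cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0]))
```

By the usual convention, values above roughly 0.8 are read as large effects, which puts the reported values of 1.67 and 1.81 well into that range.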